Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 115584 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 15.9 MiB |
| Average record size in memory | 144.0 B |
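The memory figures above are consistent with each other, which a little arithmetic can confirm. The sketch below assumes each cell occupies 8 bytes (as a 64-bit integer or an object pointer would); that per-cell size is an assumption, not something the report states.

```python
# Figures copied from the Dataset statistics table.
n_variables = 18
n_observations = 115584

# Assumption (not stated in the report): every cell takes 8 bytes,
# as a 64-bit integer or an object pointer would.
bytes_per_cell = 8

record_bytes = n_variables * bytes_per_cell        # bytes per row
total_mib = record_bytes * n_observations / 2**20  # bytes -> MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the row size comes out at 144 B and the total rounds to the 15.9 MiB reported.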
Variable types
| Categorical | 9 |
|---|---|
| Numeric | 9 |
accident_year has constant value "2020" | Constant |
accident_index has a high cardinality: 91199 distinct values | High cardinality |
accident_reference has a high cardinality: 91199 distinct values | High cardinality |
casualty_reference is highly correlated with car_passenger | High correlation |
casualty_class is highly correlated with pedestrian_location and 2 other fields | High correlation |
age_of_casualty is highly correlated with age_band_of_casualty | High correlation |
age_band_of_casualty is highly correlated with age_of_casualty | High correlation |
pedestrian_location is highly correlated with casualty_class and 2 other fields | High correlation |
pedestrian_movement is highly correlated with casualty_class and 2 other fields | High correlation |
car_passenger is highly correlated with casualty_reference and 1 other fields | High correlation |
casualty_type is highly correlated with pedestrian_location and 1 other fields | High correlation |
casualty_home_area_type is highly correlated with casualty_imd_decile | High correlation |
casualty_imd_decile is highly correlated with casualty_home_area_type | High correlation |
casualty_class is highly correlated with pedestrian_location and 1 other fields | High correlation |
pedestrian_location is highly correlated with casualty_class and 1 other fields | High correlation |
pedestrian_movement is highly correlated with casualty_class and 1 other fields | High correlation |
sex_of_casualty is highly correlated with accident_year | High correlation |
casualty_severity is highly correlated with accident_year | High correlation |
accident_year is highly correlated with sex_of_casualty and 5 other fields | High correlation |
pedestrian_road_maintenance_worker is highly correlated with accident_year | High correlation |
casualty_home_area_type is highly correlated with accident_year | High correlation |
casualty_class is highly correlated with accident_year and 1 other fields | High correlation |
car_passenger is highly correlated with accident_year and 1 other fields | High correlation |
vehicle_reference is highly correlated with casualty_reference | High correlation |
casualty_reference is highly correlated with vehicle_reference | High correlation |
car_passenger is highly correlated with casualty_class | High correlation |
vehicle_reference is highly skewed (γ1 = 320.7263833) | Skewed |
casualty_reference is highly skewed (γ1 = 224.095606) | Skewed |
accident_index is uniformly distributed | Uniform |
accident_reference is uniformly distributed | Uniform |
pedestrian_location has 100834 (87.2%) zeros | Zeros |
pedestrian_movement has 100833 (87.2%) zeros | Zeros |
bus_or_coach_passenger has 114275 (98.9%) zeros | Zeros |
casualty_type has 14750 (12.8%) zeros | Zeros |
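The percentages in the Zeros alerts follow directly from the zero counts and the number of observations. A minimal check, using only figures quoted in this report:

```python
n_observations = 115584  # from the Dataset statistics table

# Zero counts copied from the Zeros alerts.
zero_counts = {
    "pedestrian_location": 100834,
    "pedestrian_movement": 100833,
    "bus_or_coach_passenger": 114275,
    "casualty_type": 14750,
}

# Share of zeros per variable, in percent, rounded as in the report.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```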
Reproduction
| Analysis started | 2022-02-22 14:08:06.803052 |
|---|---|
| Analysis finished | 2022-02-22 14:08:32.727717 |
| Duration | 25.92 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
accident_index
Categorical
| Distinct | 91199 |
|---|---|
| Distinct (%) | 78.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 2020440349165 | 41 |
|---|---|
| 2020990939366 | 19 |
| 2020140924772 | 17 |
| 2020460977371 | 13 |
| 2020470916576 | 12 |
| Other values (91194) |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 74161 |
|---|---|
| Unique (%) | 64.2% |
Sample
| 1st row | 2020010219808 |
|---|---|
| 2nd row | 2020010220496 |
| 3rd row | 2020010220496 |
| 4th row | 2020010228005 |
| 5th row | 2020010228006 |
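The gap between "Distinct" (91199) and "Unique" (74161) is easier to read with a small example. The sketch below assumes that "Distinct" counts the different values present and "Unique" counts the values that occur exactly once; it applies both definitions to the five sample rows above.

```python
from collections import Counter

# The five sample rows shown in the Sample table.
sample = [
    "2020010219808",
    "2020010220496",
    "2020010220496",
    "2020010228005",
    "2020010228006",
]

counts = Counter(sample)

n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```

Here the repeated index 2020010220496 is distinct but not unique, which is what happens whenever one accident has more than one casualty row.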
Common Values
| Value | Count | Frequency (%) |
| 2020440349165 | 41 | < 0.1% |
| 2020990939366 | 19 | < 0.1% |
| 2020140924772 | 17 | < 0.1% |
| 2020460977371 | 13 | < 0.1% |
| 2020470916576 | 12 | < 0.1% |
| 2020010237692 | 11 | < 0.1% |
| 2020170H10270 | 11 | < 0.1% |
| 2020350984206 | 11 | < 0.1% |
| 202006F172817 | 11 | < 0.1% |
| 2020160984754 | 10 | < 0.1% |
| Other values (91189) | 115428 |
accident_year
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 2020 |
|---|
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2020 |
|---|---|
| 2nd row | 2020 |
| 3rd row | 2020 |
| 4th row | 2020 |
| 5th row | 2020 |
Common Values
| Value | Count | Frequency (%) |
| 2020 | 115584 |
accident_reference
Categorical
| Distinct | 91199 |
|---|---|
| Distinct (%) | 78.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 440349165 | 41 |
|---|---|
| 990939366 | 19 |
| 140924772 | 17 |
| 460977371 | 13 |
| 470916576 | 12 |
| Other values (91194) |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 74161 |
|---|---|
| Unique (%) | 64.2% |
Sample
| 1st row | 010219808 |
|---|---|
| 2nd row | 010220496 |
| 3rd row | 010220496 |
| 4th row | 010228005 |
| 5th row | 010228006 |
Common Values
| Value | Count | Frequency (%) |
| 440349165 | 41 | < 0.1% |
| 990939366 | 19 | < 0.1% |
| 140924772 | 17 | < 0.1% |
| 460977371 | 13 | < 0.1% |
| 470916576 | 12 | < 0.1% |
| 010237692 | 11 | < 0.1% |
| 170H10270 | 11 | < 0.1% |
| 350984206 | 11 | < 0.1% |
| 06F172817 | 11 | < 0.1% |
| 160984754 | 10 | < 0.1% |
| Other values (91189) | 115428 |
vehicle_reference
Real number (ℝ)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.460556824 |
| Minimum | 1 |
|---|---|
| Maximum | 999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 2 |
| Maximum | 999 |
| Range | 998 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 2.991765428 |
|---|---|
| Coefficient of variation (CV) | 2.048373181 |
| Kurtosis | 106936.656 |
| Mean | 1.460556824 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 320.7263833 |
| Sum | 168817 |
| Variance | 8.950660379 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 67571 | |
| 2 | 44589 | |
| 3 | 2846 | 2.5% |
| 4 | 436 | 0.4% |
| 5 | 99 | 0.1% |
| 6 | 25 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 2 | < 0.1% |
| 10 | 2 | < 0.1% |
| Other values (2) | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 67571 | |
| 2 | 44589 | |
| 3 | 2846 | 2.5% |
| 4 | 436 | 0.4% |
| 5 | 99 | 0.1% |
| 6 | 25 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 5 | < 0.1% |
| 9 | 2 | < 0.1% |
| 10 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 999 | 1 | < 0.1% |
| 11 | 2 | < 0.1% |
| 10 | 2 | < 0.1% |
| 9 | 2 | < 0.1% |
| 8 | 5 | < 0.1% |
| 7 | 6 | < 0.1% |
| 6 | 25 | < 0.1% |
| 5 | 99 | 0.1% |
| 4 | 436 | 0.4% |
| 3 | 2846 |
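The common-value and extreme-value tables together list every value of vehicle_reference, so the descriptive statistics can be recomputed from them. The sketch below does that and shows how the single record with value 999 produces the very large skewness. It assumes the report uses the sample variance (divisor n - 1) and the adjusted Fisher-Pearson skewness, as pandas does by default; the report itself does not state this.

```python
import math

# Value -> count, combined from the common-values and extreme-values tables.
freq = {1: 67571, 2: 44589, 3: 2846, 4: 436, 5: 99, 6: 25,
        7: 6, 8: 5, 9: 2, 10: 2, 11: 2, 999: 1}

n = sum(freq.values())
total = sum(v * c for v, c in freq.items())
mean = total / n

def central_moment(k):
    """k-th central moment of the tabulated distribution (divisor n)."""
    return sum(c * (v - mean) ** k for v, c in freq.items()) / n

# Assumption: sample variance with divisor n - 1.
variance = central_moment(2) * n / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

# Assumption: adjusted Fisher-Pearson skewness.
g1 = central_moment(3) / central_moment(2) ** 1.5
skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)

print(n, total, round(mean, 6), round(variance, 4), round(cv, 4), round(skew, 2))
```

Dropping the 999 entry from `freq` and rerunning is a quick way to see how much of the skewness and kurtosis comes from that one record.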
casualty_reference
Real number (ℝ)
| Distinct | 43 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.347790352 |
| Minimum | 1 |
|---|---|
| Maximum | 992 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 992 |
| Range | 991 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 4.036720714 |
|---|---|
| Coefficient of variation (CV) | 2.995065745 |
| Kurtosis | 52821.99629 |
| Mean | 1.347790352 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 224.095606 |
| Sum | 155783 |
| Variance | 16.29511412 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 90228 | |
| 2 | 17657 | 15.3% |
| 3 | 5020 | 4.3% |
| 4 | 1689 | 1.5% |
| 5 | 571 | 0.5% |
| 6 | 203 | 0.2% |
| 7 | 83 | 0.1% |
| 8 | 35 | < 0.1% |
| 9 | 19 | < 0.1% |
| 10 | 13 | < 0.1% |
| Other values (33) | 66 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 90228 | |
| 2 | 17657 | 15.3% |
| 3 | 5020 | 4.3% |
| 4 | 1689 | 1.5% |
| 5 | 571 | 0.5% |
| 6 | 203 | 0.2% |
| 7 | 83 | 0.1% |
| 8 | 35 | < 0.1% |
| 9 | 19 | < 0.1% |
| 10 | 13 | < 0.1% |
| Value | Count | Frequency (%) |
| 992 | 1 | |
| 902 | 1 | |
| 41 | 1 | |
| 40 | 2 | |
| 39 | 1 | |
| 38 | 1 | |
| 37 | 1 | |
| 36 | 1 | |
| 35 | 1 | |
| 34 | 1 |
casualty_class
Categorical
HIGH CORRELATION
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 1 | |
|---|---|
| 2 | |
| 3 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 3 |
| 3rd row | 3 |
| 4th row | 3 |
| 5th row | 3 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 79330 | |
| 2 | 21504 | 18.6% |
| 3 | 14750 | 12.8% |
sex_of_casualty
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 1 | |
|---|---|
| 2 | |
| -1 | 756 |
| 9 | 5 |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.006540698 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 72335 | |
| 2 | 42488 | |
| -1 | 756 | 0.7% |
| 9 | 5 | < 0.1% |
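The fractional "Mean length" reported for this variable (1.006540698) comes entirely from the 756 records coded "-1", the only two-character value. A short check using the counts in the Common Values table:

```python
# Counts copied from the Common Values table; values are stored as strings.
counts = {"1": 72335, "2": 42488, "-1": 756, "9": 5}

n = sum(counts.values())
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n

print(n, round(mean_length, 9))
```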
age_of_casualty
Real number (ℝ)
| Distinct | 101 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.48974772 |
| Minimum | -1 |
|---|---|
| Maximum | 99 |
| Zeros | 130 |
| Zeros (%) | 0.1% |
| Negative | 2481 |
| Negative (%) | 2.1% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 23 |
| median | 33 |
| Q3 | 50 |
| 95-th percentile | 72 |
| Maximum | 99 |
| Range | 100 |
| Interquartile range (IQR) | 27 |
Descriptive statistics
| Standard deviation | 18.98502214 |
|---|---|
| Coefficient of variation (CV) | 0.5202837324 |
| Kurtosis | -0.2047937287 |
| Mean | 36.48974772 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 0.4471627493 |
| Sum | 4217631 |
| Variance | 360.4310656 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30 | 3135 | 2.7% |
| 19 | 2881 | 2.5% |
| 20 | 2796 | 2.4% |
| 25 | 2749 | 2.4% |
| 26 | 2739 | 2.4% |
| 28 | 2738 | 2.4% |
| 23 | 2718 | 2.4% |
| 22 | 2718 | 2.4% |
| 24 | 2699 | 2.3% |
| 18 | 2686 | 2.3% |
| Other values (91) | 87725 |
| Value | Count | Frequency (%) |
| -1 | 2481 | |
| 0 | 130 | 0.1% |
| 1 | 186 | 0.2% |
| 2 | 295 | 0.3% |
| 3 | 349 | 0.3% |
| 4 | 436 | 0.4% |
| 5 | 436 | 0.4% |
| 6 | 431 | 0.4% |
| 7 | 529 | 0.5% |
| 8 | 491 | 0.4% |
| Value | Count | Frequency (%) |
| 99 | 2 | < 0.1% |
| 98 | 7 | < 0.1% |
| 97 | 2 | < 0.1% |
| 96 | 13 | < 0.1% |
| 95 | 21 | < 0.1% |
| 94 | 17 | < 0.1% |
| 93 | 38 | |
| 92 | 49 | |
| 91 | 66 | |
| 90 | 93 |
age_band_of_casualty
Real number (ℝ)
HIGH CORRELATION
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.292609704 |
| Minimum | -1 |
|---|---|
| Maximum | 11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2481 |
| Negative (%) | 2.1% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 6 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 11 |
| Range | 12 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.392856268 |
|---|---|
| Coefficient of variation (CV) | 0.3802645294 |
| Kurtosis | 0.6772990663 |
| Mean | 6.292609704 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.5028817423 |
| Sum | 727325 |
| Variance | 5.725761117 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 25511 | |
| 7 | 17805 | |
| 8 | 15669 | |
| 5 | 13568 | |
| 4 | 11627 | |
| 9 | 10390 | |
| 10 | 5337 | 4.6% |
| 3 | 4740 | 4.1% |
| 11 | 4025 | 3.5% |
| 2 | 2599 | 2.2% |
| Other values (2) | 4313 | 3.7% |
| Value | Count | Frequency (%) |
| -1 | 2481 | 2.1% |
| 1 | 1832 | 1.6% |
| 2 | 2599 | 2.2% |
| 3 | 4740 | 4.1% |
| 4 | 11627 | |
| 5 | 13568 | |
| 6 | 25511 | |
| 7 | 17805 | |
| 8 | 15669 | |
| 9 | 10390 |
| Value | Count | Frequency (%) |
| 11 | 4025 | 3.5% |
| 10 | 5337 | 4.6% |
| 9 | 10390 | |
| 8 | 15669 | |
| 7 | 17805 | |
| 6 | 25511 | |
| 5 | 13568 | |
| 4 | 11627 | |
| 3 | 4740 | 4.1% |
| 2 | 2599 | 2.2% |
casualty_severity
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 3 | |
|---|---|
| 2 | |
| 1 | 1460 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 3 |
| 3rd row | 3 |
| 4th row | 3 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 94022 | |
| 2 | 20102 | 17.4% |
| 1 | 1460 | 1.3% |
pedestrian_location
Real number (ℝ)
HIGH CORRELATION
ZEROS
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6968611573 |
| Minimum | -1 |
|---|---|
| Maximum | 10 |
| Zeros | 100834 |
| Zeros (%) | 87.2% |
| Negative | 2 |
| Negative (%) | < 0.1% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 5 |
| Maximum | 10 |
| Range | 11 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 2.059929866 |
|---|---|
| Coefficient of variation (CV) | 2.956011889 |
| Kurtosis | 8.474022246 |
| Mean | 0.6968611573 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.047900729 |
| Sum | 80546 |
| Variance | 4.243311051 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 100834 | |
| 5 | 5828 | 5.0% |
| 1 | 2419 | 2.1% |
| 6 | 1709 | 1.5% |
| 9 | 1614 | 1.4% |
| 10 | 1393 | 1.2% |
| 4 | 845 | 0.7% |
| 8 | 757 | 0.7% |
| 7 | 91 | 0.1% |
| 2 | 70 | 0.1% |
| Other values (2) | 24 | < 0.1% |
| Value | Count | Frequency (%) |
| -1 | 2 | < 0.1% |
| 0 | 100834 | |
| 1 | 2419 | 2.1% |
| 2 | 70 | 0.1% |
| 3 | 22 | < 0.1% |
| 4 | 845 | 0.7% |
| 5 | 5828 | 5.0% |
| 6 | 1709 | 1.5% |
| 7 | 91 | 0.1% |
| 8 | 757 | 0.7% |
| Value | Count | Frequency (%) |
| 10 | 1393 | 1.2% |
| 9 | 1614 | 1.4% |
| 8 | 757 | 0.7% |
| 7 | 91 | 0.1% |
| 6 | 1709 | 1.5% |
| 5 | 5828 | |
| 4 | 845 | 0.7% |
| 3 | 22 | < 0.1% |
| 2 | 70 | 0.1% |
| 1 | 2419 |
pedestrian_movement
Real number (ℝ)
HIGH CORRELATION
ZEROS
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5615915698 |
| Minimum | -1 |
|---|---|
| Maximum | 9 |
| Zeros | 100833 |
| Zeros (%) | 87.2% |
| Negative | 2 |
| Negative (%) | < 0.1% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 4 |
| Maximum | 9 |
| Range | 10 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.879680074 |
|---|---|
| Coefficient of variation (CV) | 3.347058921 |
| Kurtosis | 13.20460614 |
| Mean | 0.5615915698 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.749485504 |
| Sum | 64911 |
| Variance | 3.53319718 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 100833 | |
| 1 | 4593 | 4.0% |
| 9 | 4132 | 3.6% |
| 3 | 3093 | 2.7% |
| 5 | 776 | 0.7% |
| 2 | 746 | 0.6% |
| 4 | 569 | 0.5% |
| 8 | 417 | 0.4% |
| 7 | 331 | 0.3% |
| 6 | 92 | 0.1% |
| Value | Count | Frequency (%) |
| -1 | 2 | < 0.1% |
| 0 | 100833 | |
| 1 | 4593 | 4.0% |
| 2 | 746 | 0.6% |
| 3 | 3093 | 2.7% |
| 4 | 569 | 0.5% |
| 5 | 776 | 0.7% |
| 6 | 92 | 0.1% |
| 7 | 331 | 0.3% |
| 8 | 417 | 0.4% |
| Value | Count | Frequency (%) |
| 9 | 4132 | 3.6% |
| 8 | 417 | 0.4% |
| 7 | 331 | 0.3% |
| 6 | 92 | 0.1% |
| 5 | 776 | 0.7% |
| 4 | 569 | 0.5% |
| 3 | 3093 | 2.7% |
| 2 | 746 | 0.6% |
| 1 | 4593 | 4.0% |
| 0 | 100833 |
car_passenger
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 0 | |
|---|---|
| 1 | |
| 2 | 6543 |
| -1 | 311 |
| 9 | 117 |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.002690684 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 96655 | |
| 1 | 11958 | 10.3% |
| 2 | 6543 | 5.7% |
| -1 | 311 | 0.3% |
| 9 | 117 | 0.1% |
bus_or_coach_passenger
Real number (ℝ)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.03895002769 |
| Minimum | -1 |
|---|---|
| Maximum | 9 |
| Zeros | 114275 |
| Zeros (%) | 98.9% |
| Negative | 22 |
| Negative (%) | < 0.1% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 9 |
| Range | 10 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3815276336 |
|---|---|
| Coefficient of variation (CV) | 9.795310974 |
| Kurtosis | 112.1954802 |
| Mean | 0.03895002769 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 10.2245913 |
| Sum | 4502 |
| Variance | 0.1455633352 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 114275 | |
| 4 | 796 | 0.7% |
| 3 | 350 | 0.3% |
| 2 | 77 | 0.1% |
| 1 | 55 | < 0.1% |
| -1 | 22 | < 0.1% |
| 9 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| -1 | 22 | < 0.1% |
| 0 | 114275 | |
| 1 | 55 | < 0.1% |
| 2 | 77 | 0.1% |
| 3 | 350 | 0.3% |
| 4 | 796 | 0.7% |
| 9 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| 9 | 9 | < 0.1% |
| 4 | 796 | 0.7% |
| 3 | 350 | 0.3% |
| 2 | 77 | 0.1% |
| 1 | 55 | < 0.1% |
| 0 | 114275 | |
| -1 | 22 | < 0.1% |
pedestrian_road_maintenance_worker
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 0 | |
|---|---|
| 2 | 745 |
| -1 | 94 |
| 1 | 73 |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.000813261 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 114672 | |
| 2 | 745 | 0.6% |
| -1 | 94 | 0.1% |
| 1 | 73 | 0.1% |
casualty_type
Real number (ℝ)
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.388366902 |
| Minimum | 0 |
|---|---|
| Maximum | 98 |
| Zeros | 14750 |
| Zeros (%) | 12.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 9 |
| Q3 | 9 |
| 95-th percentile | 11 |
| Maximum | 98 |
| Range | 98 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 9.914713615 |
|---|---|
| Coefficient of variation (CV) | 1.341935741 |
| Kurtosis | 56.13020795 |
| Mean | 7.388366902 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.75436421 |
| Sum | 853977 |
| Variance | 98.30154606 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 62698 | |
| 1 | 16294 | 14.1% |
| 0 | 14750 | 12.8% |
| 3 | 6993 | 6.1% |
| 5 | 3677 | 3.2% |
| 19 | 3235 | 2.8% |
| 4 | 1546 | 1.3% |
| 11 | 1506 | 1.3% |
| 8 | 1419 | 1.2% |
| 2 | 1001 | 0.9% |
| Other values (11) | 2465 | 2.1% |
| Value | Count | Frequency (%) |
| 0 | 14750 | 12.8% |
| 1 | 16294 | 14.1% |
| 2 | 1001 | 0.9% |
| 3 | 6993 | 6.1% |
| 4 | 1546 | 1.3% |
| 5 | 3677 | 3.2% |
| 8 | 1419 | 1.2% |
| 9 | 62698 | |
| 10 | 138 | 0.1% |
| 11 | 1506 | 1.3% |
| Value | Count | Frequency (%) |
| 98 | 209 | 0.2% |
| 97 | 301 | 0.3% |
| 90 | 695 | 0.6% |
| 23 | 86 | 0.1% |
| 22 | 155 | 0.1% |
| 21 | 447 | 0.4% |
| 20 | 263 | 0.2% |
| 19 | 3235 | |
| 18 | 3 | < 0.1% |
| 17 | 82 | 0.1% |
casualty_home_area_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 903.1 KiB |
| 1 | |
|---|---|
| 3 | |
| -1 | |
| 2 |
Length
| Max length | 2 |
|---|---|
| Median length | 1 |
| Mean length | 1.093109773 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 85122 | |
| 3 | 10860 | 9.4% |
| -1 | 10762 | 9.3% |
| 2 | 8840 | 7.6% |
casualty_imd_decile
Real number (ℝ)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.36113995 |
| Minimum | -1 |
|---|---|
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 10910 |
| Negative (%) | 9.4% |
| Memory size | 903.1 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 7 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.171409748 |
|---|---|
| Coefficient of variation (CV) | 0.7271974263 |
| Kurtosis | -0.9571051199 |
| Mean | 4.36113995 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.08002709433 |
| Sum | 504078 |
| Variance | 10.05783979 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 13604 | |
| 1 | 12848 | |
| 3 | 12826 | |
| 4 | 11597 | |
| -1 | 10910 | |
| 5 | 10863 | |
| 6 | 10225 | |
| 7 | 9255 | |
| 8 | 8438 | |
| 9 | 8068 |
| Value | Count | Frequency (%) |
| -1 | 10910 | |
| 1 | 12848 | |
| 2 | 13604 | |
| 3 | 12826 | |
| 4 | 11597 | |
| 5 | 10863 | |
| 6 | 10225 | |
| 7 | 9255 | |
| 8 | 8438 | |
| 9 | 8068 |
| Value | Count | Frequency (%) |
| 10 | 6950 | |
| 9 | 8068 | |
| 8 | 8438 | |
| 7 | 9255 | |
| 6 | 10225 | |
| 5 | 10863 | |
| 4 | 11597 | |
| 3 | 12826 | |
| 2 | 13604 | |
| 1 | 12848 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
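The calculation just described can be sketched in plain Python: rank both variables, then take the Pearson correlation of the ranks. This is a minimal illustration, not the report's own implementation; the helper names `average_ranks`, `pearson` and `spearman` are made up for the example, and ties are given the mean of their ranks, which is a common convention assumed here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]  # monotonic but nonlinear

print(spearman(x, y), pearson(x, y))
```

For this monotonic but curved relationship ρ is 1 (up to floating-point rounding), while Pearson's r stays below 1.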
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
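A minimal sketch of that calculation, with a check of the invariance property, is given below. The data are made up for illustration and the function name `pearson` is not taken from the report.

```python
import math

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Illustrative data: y is close to, but not exactly, a linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(x, y)

# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_shifted = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A shallow line and a steep line are both perfectly linear.
r_shallow = pearson(x, [0.1 * a for a in x])
r_steep = pearson(x, [50 * a for a in x])

print(r, r_shifted, r_shallow, r_steep)
```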
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
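The pair-counting formula above can be sketched as follows. This is the simple variant without any tie correction (often called τ-a); statistical libraries commonly apply a tie-adjusted variant instead, so results may differ on data with many ties. The function name `kendall_tau_a` and the toy data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1    # both variables order the pair the same way
        elif s < 0:
            discordant += 1    # the variables order the pair in opposite ways
        # s == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Toy data (made up): two adjacent swaps relative to perfect agreement.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here 8 of the 10 pairs are concordant and 2 are discordant, so τ = (8 - 2) / 10.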
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | accident_index | accident_year | accident_reference | vehicle_reference | casualty_reference | casualty_class | sex_of_casualty | age_of_casualty | age_band_of_casualty | casualty_severity | pedestrian_location | pedestrian_movement | car_passenger | bus_or_coach_passenger | pedestrian_road_maintenance_worker | casualty_type | casualty_home_area_type | casualty_imd_decile |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020010219808 | 2020 | 010219808 | 1 | 1 | 3 | 1 | 31 | 6 | 3 | 9 | 5 | 0 | 0 | 0 | 0 | 1 | 4 |
| 1 | 2020010220496 | 2020 | 010220496 | 1 | 1 | 3 | 2 | 2 | 1 | 3 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 2 |
| 2 | 2020010220496 | 2020 | 010220496 | 1 | 2 | 3 | 2 | 4 | 1 | 3 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 2 |
| 3 | 2020010228005 | 2020 | 010228005 | 1 | 1 | 3 | 1 | 23 | 5 | 3 | 5 | 9 | 0 | 0 | 0 | 0 | 1 | 3 |
| 4 | 2020010228006 | 2020 | 010228006 | 1 | 1 | 3 | 1 | 47 | 8 | 2 | 4 | 1 | 0 | 0 | 0 | 0 | 1 | 3 |
| 5 | 2020010228011 | 2020 | 010228011 | 1 | 1 | 3 | 2 | 32 | 6 | 3 | 6 | 9 | 0 | 0 | 0 | 0 | 1 | 8 |
| 6 | 2020010228011 | 2020 | 010228011 | 1 | 2 | 3 | 2 | 33 | 6 | 3 | 6 | 9 | 0 | 0 | 0 | 0 | -1 | -1 |
| 7 | 2020010228012 | 2020 | 010228012 | 1 | 1 | 1 | 1 | 25 | 5 | 3 | 0 | 0 | 0 | 0 | 0 | 9 | 1 | 4 |
| 8 | 2020010228014 | 2020 | 010228014 | 1 | 1 | 1 | 1 | 41 | 7 | 3 | 0 | 0 | 0 | 0 | 0 | 9 | 1 | 3 |
| 9 | 2020010228017 | 2020 | 010228017 | 1 | 1 | 3 | 1 | 50 | 8 | 2 | 9 | 9 | 0 | 0 | 0 | 0 | 1 | 3 |
Last rows
| | accident_index | accident_year | accident_reference | vehicle_reference | casualty_reference | casualty_class | sex_of_casualty | age_of_casualty | age_band_of_casualty | casualty_severity | pedestrian_location | pedestrian_movement | car_passenger | bus_or_coach_passenger | pedestrian_road_maintenance_worker | casualty_type | casualty_home_area_type | casualty_imd_decile |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 115574 | 2020991023880 | 2020 | 991023880 | 1 | 1 | 3 | 2 | 58 | 9 | 3 | 5 | 1 | 0 | 0 | 0 | 0 | 1 | 4 |
| 115575 | 2020991024039 | 2020 | 991024039 | 2 | 1 | 1 | 1 | 52 | 8 | 3 | 0 | 0 | 0 | 0 | 0 | 9 | 3 | 4 |
| 115576 | 2020991024209 | 2020 | 991024209 | 1 | 1 | 1 | 2 | 33 | 6 | 3 | 0 | 0 | 0 | 0 | 0 | 9 | 1 | 10 |
| 115577 | 2020991024209 | 2020 | 991024209 | 1 | 2 | 2 | 2 | 13 | 3 | 3 | 0 | 0 | 2 | 0 | 0 | 9 | 1 | 8 |
| 115578 | 2020991024526 | 2020 | 991024526 | 1 | 1 | 3 | 1 | 69 | 10 | 3 | 6 | 9 | 0 | 0 | 0 | 0 | 3 | 7 |
| 115579 | 2020991027064 | 2020 | 991027064 | 2 | 1 | 1 | 1 | 11 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 |
| 115580 | 2020991029573 | 2020 | 991029573 | 1 | 1 | 3 | 2 | 63 | 9 | 3 | 10 | 1 | 0 | 0 | 0 | 0 | 1 | 10 |
| 115581 | 2020991030297 | 2020 | 991030297 | 2 | 1 | 1 | 1 | 38 | 7 | 2 | 0 | 0 | 0 | 0 | 0 | 5 | 2 | 9 |
| 115582 | 2020991030900 | 2020 | 991030900 | 2 | 1 | 1 | 1 | 76 | 11 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 9 |
| 115583 | 2020991032575 | 2020 | 991032575 | 1 | 1 | 3 | 1 | 48 | 8 | 3 | 9 | 9 | 0 | 0 | 0 | 0 | 1 | 1 |